Escaping Saddle Points in Heterogeneous Federated Learning via Distributed SGD with Communication Compression
We consider the problem of finding second-order stationary points in
heterogeneous federated learning (FL). Previous works in FL mostly focus on
first-order convergence guarantees, which do not rule out convergence to
unstable saddle points. Meanwhile, a key bottleneck of FL is achieving
communication efficiency without compromising learning accuracy, especially
when local data are highly heterogeneous across clients. Given this,
we propose Power-EF, a new algorithm that communicates only compressed
information via a novel error-feedback scheme. To our knowledge, Power-EF is
the first distributed and compressed SGD algorithm that provably escapes saddle
points in heterogeneous FL without any data homogeneity assumptions. In
particular, Power-EF improves to second-order stationary points after visiting
first-order (possibly saddle) points, using additional gradient queries and
communication rounds of almost the same order as those required for first-order
convergence, and the convergence rate exhibits a linear speedup in the number
of workers. Our theory improves upon or recovers previous results while
extending to much milder assumptions on the local data. Numerical experiments
are provided to complement the theory.
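For context, the distinction between first- and second-order stationary points drawn in the abstract follows the standard definitions from the saddle-escaping literature; the exact thresholds used in the paper may differ. For a twice-differentiable objective f with a rho-Lipschitz Hessian, a point x is an epsilon-first-order stationary point if

\[ \|\nabla f(x)\| \le \epsilon, \]

and an epsilon-second-order stationary point if, in addition,

\[ \lambda_{\min}\bigl(\nabla^2 f(x)\bigr) \ge -\sqrt{\rho\,\epsilon}, \]

so that x has neither a large gradient nor a direction of strongly negative curvature (i.e., it is not a strict saddle).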
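To make the "communicate compressed information with error feedback" pattern concrete, below is a minimal Python sketch of generic distributed SGD with top-k compression and per-worker error feedback. It illustrates the mechanism that Power-EF builds on, not the paper's exact algorithm: the compressor choice, update rule, and all names here are illustrative assumptions.

import numpy as np

def topk_compress(v, k):
    # Keep the k largest-magnitude entries of v; zero out the rest.
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def compressed_sgd_with_ef(grad_fn, x0, n_workers, k, lr, n_rounds, rng):
    # Generic compressed distributed SGD with error feedback (illustrative,
    # not the paper's Power-EF). Each worker sends only a top-k compressed
    # message; whatever compression discards is kept in a local error buffer
    # and added back before the next compression step, so nothing is
    # permanently lost.
    x = x0.copy()
    err = [np.zeros_like(x0) for _ in range(n_workers)]
    for _ in range(n_rounds):
        msgs = []
        for w in range(n_workers):
            g = grad_fn(w, x, rng)             # stochastic local gradient
            corrected = lr * g + err[w]        # add back past compression error
            msg = topk_compress(corrected, k)  # compressed message to server
            err[w] = corrected - msg           # residual stays on the worker
            msgs.append(msg)
        x = x - np.mean(msgs, axis=0)          # server averages and updates
    return x

# Toy heterogeneous problem: each worker pulls x toward a different target,
# mimicking non-i.i.d. local data across clients.
targets = [np.full(10, float(w)) for w in range(4)]
grad_fn = lambda w, x, rng: (x - targets[w]) + 0.01 * rng.standard_normal(10)
rng = np.random.default_rng(0)
x_final = compressed_sgd_with_ef(grad_fn, np.zeros(10), n_workers=4, k=3,
                                 lr=0.1, n_rounds=500, rng=rng)
print(x_final)  # approaches the mean target (about 1.5 in every coordinate)

The error buffers are the crux of error feedback: with plain top-k compression alone, the discarded coordinates could bias the updates indefinitely, whereas re-injecting the residual each round makes the compression error self-correcting.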